glstring

Using the get_ functions

Each of these functions takes a GL String as its argument.


In [1]:
import glstring
print(glstring.__file__)


/Users/bmilius/.virtualenvs/p36/lib/python3.6/site-packages/glstring/__init__.py

In [8]:
from glstring.glstring import *
a = "HLA-A*01:01/HLA-A*01:02+HLA-A*24:02|HLA-A*01:03+HLA-A*24:03^HLA-B*44:01+HLA-B*44:02"
print(a)


HLA-A*01:01/HLA-A*01:02+HLA-A*24:02|HLA-A*01:03+HLA-A*24:03^HLA-B*44:01+HLA-B*44:02

get_alleles() & get_loci()

  • Each of these functions returns a set of strings, so duplicates are dropped and the order is not significant.

In [9]:
get_alleles(a)


Out[9]:
{'HLA-A*01:01',
 'HLA-A*01:02',
 'HLA-A*01:03',
 'HLA-A*24:02',
 'HLA-A*24:03',
 'HLA-B*44:01',
 'HLA-B*44:02'}

In [10]:
get_loci(a)


Out[10]:
{'HLA-A', 'HLA-B'}
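
As a rough illustration of what these two calls compute, the same sets can be derived with the standard library by splitting on every GL String delimiter. This is only a sketch under that assumption, not the glstring package's implementation; the helper names sketch_alleles and sketch_loci are hypothetical.

```python
import re

# GL String delimiters: ^ (locus), | (genotype list), + (genotype),
# ~ (haplotype), / (allele list)
DELIMITERS = re.compile(r"[\^|+~/]")


def sketch_alleles(gl):
    """Return the set of alleles in a GL String (hypothetical helper)."""
    return set(DELIMITERS.split(gl))


def sketch_loci(gl):
    """Return the set of loci: the part of each allele name before '*'."""
    return {allele.split("*")[0] for allele in sketch_alleles(gl)}


a = "HLA-A*01:01/HLA-A*01:02+HLA-A*24:02|HLA-A*01:03+HLA-A*24:03^HLA-B*44:01+HLA-B*44:02"
print(sorted(sketch_alleles(a)))
print(sorted(sketch_loci(a)))
```

Sorting before printing is only for a stable display, since sets have no defined order.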

get_allele_lists(), get_genotypes(), get_genotype_lists(), get_locus_blocks(), get_genotype_blocks(), & get_genotype_list_blocks()

Each of these functions returns a list of the objects found in the GL String.

  • Locus blocks are separated by a ^.

In [11]:
get_locus_blocks(a)


Out[11]:
['HLA-A*01:01/HLA-A*01:02+HLA-A*24:02|HLA-A*01:03+HLA-A*24:03',
 'HLA-B*44:01+HLA-B*44:02']

  • Genotype list blocks are separated by |.

In [13]:
get_genotype_list_blocks(a)


Out[13]:
['HLA-A*01:01/HLA-A*01:02+HLA-A*24:02', 'HLA-A*01:03+HLA-A*24:03']

  • Genotype blocks are separated by +.

In [14]:
get_genotype_blocks(a)


Out[14]:
['HLA-A*01:01/HLA-A*01:02',
 'HLA-A*24:02',
 'HLA-A*01:03',
 'HLA-A*24:03',
 'HLA-B*44:01',
 'HLA-B*44:02']

  • Genotype lists are found in locus blocks. They contain | delimiters, which separate the possible genotypes. There is one genotype list in this example.

In [15]:
get_genotype_lists(a)


Out[15]:
['HLA-A*01:01/HLA-A*01:02+HLA-A*24:02|HLA-A*01:03+HLA-A*24:03']

  • Genotypes contain a + delimiter and may contain allele lists.

In [16]:
get_genotypes(a)


Out[16]:
['HLA-A*01:01/HLA-A*01:02+HLA-A*24:02',
 'HLA-A*01:03+HLA-A*24:03',
 'HLA-B*44:01+HLA-B*44:02']

  • Allele lists contain a / delimiter.

In [17]:
get_allele_lists(a)


Out[17]:
['HLA-A*01:01/HLA-A*01:02']
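
The delimiter rules in the bullets above can be sketched with plain string splitting. The snippet below is an approximation that reproduces the outputs shown for this example string; it is not taken from the glstring source, and the variable names are chosen here for illustration only.

```python
import re

a = "HLA-A*01:01/HLA-A*01:02+HLA-A*24:02|HLA-A*01:03+HLA-A*24:03^HLA-B*44:01+HLA-B*44:02"

# Locus blocks: split the whole string on ^
locus_blocks = a.split("^")

# Genotype lists: locus blocks that contain a | delimiter
genotype_lists = [block for block in locus_blocks if "|" in block]

# Genotypes: split each locus block on |, keep the pieces containing +
genotypes = [g for block in locus_blocks for g in block.split("|") if "+" in g]

# Allele lists: split on every delimiter except /, keep the pieces containing /
allele_lists = [piece for piece in re.split(r"[\^|+~]", a) if "/" in piece]

for name, value in [
    ("locus blocks", locus_blocks),
    ("genotype lists", genotype_lists),
    ("genotypes", genotypes),
    ("allele lists", allele_lists),
]:
    print(name, value)
```

Each list keeps the left-to-right order of the original string, which matches the list outputs above.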

A more complex example


In [31]:
a = ("HLA-A*01:01/HLA-A*01:02+HLA-A*24:02|HLA-A*01:03+HLA-A*24:03^"
 "HLA-B*08:01+HLA-B*44:01/HLA-B*44:02^"
 "HLA-C*01:02+HLA-C*01:03^"
 "HLA-DRB5*01:01~HLA-DRB1*03:01+HLA-DRB1*04:07:01/HLA-DRB1*04:92~HLA-DRB1*03:01")
print(a)


HLA-A*01:01/HLA-A*01:02+HLA-A*24:02|HLA-A*01:03+HLA-A*24:03^HLA-B*08:01+HLA-B*44:01/HLA-B*44:02^HLA-C*01:02+HLA-C*01:03^HLA-DRB5*01:01~HLA-DRB1*03:01+HLA-DRB1*04:07:01/HLA-DRB1*04:92~HLA-DRB1*03:01

In [32]:
get_loci(a)


Out[32]:
{'HLA-A', 'HLA-B', 'HLA-C', 'HLA-DRB1', 'HLA-DRB5'}

In [33]:
get_alleles(a)


Out[33]:
{'HLA-A*01:01',
 'HLA-A*01:02',
 'HLA-A*01:03',
 'HLA-A*24:02',
 'HLA-A*24:03',
 'HLA-B*08:01',
 'HLA-B*44:01',
 'HLA-B*44:02',
 'HLA-C*01:02',
 'HLA-C*01:03',
 'HLA-DRB1*03:01',
 'HLA-DRB1*04:07:01',
 'HLA-DRB1*04:92',
 'HLA-DRB5*01:01'}

In [34]:
get_allele_lists(a)


Out[34]:
['HLA-A*01:01/HLA-A*01:02',
 'HLA-B*44:01/HLA-B*44:02',
 'HLA-DRB1*04:07:01/HLA-DRB1*04:92']

In [35]:
get_genotypes(a)


Out[35]:
['HLA-A*01:01/HLA-A*01:02+HLA-A*24:02',
 'HLA-A*01:03+HLA-A*24:03',
 'HLA-B*08:01+HLA-B*44:01/HLA-B*44:02',
 'HLA-C*01:02+HLA-C*01:03',
 'HLA-DRB5*01:01~HLA-DRB1*03:01+HLA-DRB1*04:07:01/HLA-DRB1*04:92~HLA-DRB1*03:01']

In [36]:
get_genotype_lists(a)


Out[36]:
['HLA-A*01:01/HLA-A*01:02+HLA-A*24:02|HLA-A*01:03+HLA-A*24:03']

In [37]:
get_locus_blocks(a)


Out[37]:
['HLA-A*01:01/HLA-A*01:02+HLA-A*24:02|HLA-A*01:03+HLA-A*24:03',
 'HLA-B*08:01+HLA-B*44:01/HLA-B*44:02',
 'HLA-C*01:02+HLA-C*01:03',
 'HLA-DRB5*01:01~HLA-DRB1*03:01+HLA-DRB1*04:07:01/HLA-DRB1*04:92~HLA-DRB1*03:01']

In [38]:
get_genotypes(get_locus_blocks(a)[0])


Out[38]:
['HLA-A*01:01/HLA-A*01:02+HLA-A*24:02', 'HLA-A*01:03+HLA-A*24:03']

In [39]:
get_genotypes(get_locus_blocks(a)[1])


Out[39]:
['HLA-B*08:01+HLA-B*44:01/HLA-B*44:02']

In [40]:
get_allele_lists(get_genotypes(get_locus_blocks(a)[0])[0])


Out[40]:
['HLA-A*01:01/HLA-A*01:02']

In [41]:
get_alleles(get_allele_lists(get_genotypes(get_locus_blocks(a)[0])[0])[0])


Out[41]:
{'HLA-A*01:01', 'HLA-A*01:02'}

  • Haplotypes contain a ~ delimiter.

In [42]:
get_haplotypes(a)


Out[42]:
['HLA-DRB5*01:01~HLA-DRB1*03:01',
 'HLA-DRB1*04:07:01/HLA-DRB1*04:92~HLA-DRB1*03:01']
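
The haplotype output above can be approximated the same way: ~ is the GL String haplotype delimiter, so splitting on ^, | and + and keeping the pieces that contain ~ yields the same two strings for this example. This is a sketch rather than the package's own code, and sketch_haplotypes is a hypothetical name.

```python
import re

a = ("HLA-A*01:01/HLA-A*01:02+HLA-A*24:02|HLA-A*01:03+HLA-A*24:03^"
     "HLA-B*08:01+HLA-B*44:01/HLA-B*44:02^"
     "HLA-C*01:02+HLA-C*01:03^"
     "HLA-DRB5*01:01~HLA-DRB1*03:01+HLA-DRB1*04:07:01/HLA-DRB1*04:92~HLA-DRB1*03:01")


def sketch_haplotypes(gl):
    """Split on ^, | and +, then keep the pieces that contain ~ (hypothetical helper)."""
    return [piece for piece in re.split(r"[\^|+]", gl) if "~" in piece]


print(sketch_haplotypes(a))
```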
